Multi-source target tracking method for complex scenes
Patent abstract:
The present invention relates to the field of artificial intelligence technology, and provides a multi-source target tracking method for complex scenes, including the following steps: forming an initial deep learning network, setting weights of the initial deep learning network, performing steepest gradient dimensionality reduction on the weights of the initial deep learning network, and forming an initial feature matrix of a multi-source target; performing simplified sparse representation and sparsification processing on the initial feature matrix of the multi-source target to obtain a sparsified target feature matrix; and classifying the sparsified target feature matrix, and establishing feature matrix representations of different feature targets. A multi-source target feature matrix library established through the deep learning network in the present invention has a good modeling characteristic and data statistical structure, and a multi-source target tracker model is generated for unlabeled training data.
Publication number: NL2026432A
Application number: NL2026432
Filing date: 2020-09-09
Publication date: 2021-05-11
Inventors: Wan Ling; Wang Feng; Guan Qingyang; Zhang Renhui
Applicant: Shenzhen Demio Tech Co Ltd
IPC main classification:
Patent description:
[0001] The present invention relates to the field of data processing technology, and in particular relates to a multi-source target tracking method for complex scenes based on deep learning. [0002] A multi-source target tracking method for complex scenes can provide accurate target data features, and achieve tracking of a multi-source target by constructing a deep learning processing architecture based on feature matrix learning. Specifically, a template model, which is autonomous, controllable, and extendable, is formed by training a feature matrix of many typical simulation scenes. Such a model also has a function of enhancing the feature matrix of the scenes, and may achieve enhancement of special features such as shadows and texture of the scenes. [0003] As a signal transformation method, feature matrix learning may achieve approximate representation of high-dimensional space features by low-dimensional space vectors according to the basis vectors of a complete feature matrix. However, at present, when feature matrix atoms are updated by feature matrix learning itself, it is difficult to reduce the dimensionality of the feature matrix, and a large amount of calculation is involved when feature selection is performed on the image target. Moreover, one of the core issues of feature matrix learning is sparse representation. To improve the sparse representation of feature matrix learning, a fixed feature matrix basis vector usually needs to be updated in the design of the feature matrix. Whether a complete feature matrix is well designed determines how closely a real signal can be represented. Therefore, there is currently a need for a tracking processing method that reduces the dimensionality of a feature matrix and represents a real target more closely.
[0004] An objective of the present invention is to provide a multi-source target tracking method for complex scenes based on deep learning, intended to solve the technical problems in the prior art that the dimensionality of a feature matrix is difficult to reduce and a real target cannot be represented closely enough. [0005] To achieve the above objective, embodiments of the present invention provide the following technical solution: [0006] A multi-source target tracking method for complex scenes includes the following steps: [0007] S1: for a multi-source target, forming an initial deep learning network, setting weights of the initial deep learning network, performing steepest gradient dimensionality reduction on the weights of the initial deep learning network, and forming an initial feature matrix library of the multi-source target; [0008] S2: performing simplified sparse representation and sparsification processing on the preliminary features of the multi-source target to obtain a sparsified feature matrix; [0009] S3: performing multi-source target classification on the sparsified feature matrix, and establishing feature representations of different feature targets; [0010] S4: minimizing the sparsified feature matrix; [0011] S5: building a generative model and a reconstruction model for feature matrix learning; [0012] S6: performing effective multi-source tracking of the targets through the generative model and the reconstruction model.
[0013] Preferably, step S1 specifically includes: building a deep learning network, and forming the initial feature matrix by using features of the multi-source target through the deep learning network, wherein the initial feature matrix has adaptive change and improvement functions, and an expression of the initial feature matrix is:
[0014] R = argmin_R Σ_{i=1}^{k} ||b_i − R·s_i||_2^2 (1),
[0015] where R represents the feature matrix of the multi-source target, k represents the number of targets, i indexes each target feature, s_i represents a weight matrix of the deep learning network, and b_i represents the tracked multi-source target. [0016] Preferably, step S2 includes: performing effective feature matrix update iteration by feature decomposition, and implementing generation of sparsified features through preprocessing of the feature matrix. [0017] Further preferably, step S2 includes: transforming the optimization problem of formula (1) into the optimal value problem of formula (2), as indicated by the following formula:
[0018] R = argmin_{r∈R^d} Σ_{i=1}^{k} ||b_i − R·s_i||_2^2 (2),
[0019] where R represents the initial feature matrix of the multi-source target, and s represents a weight matrix of the deep learning network. [0020] Preferably, step S3 further includes a feature matrix update step, wherein the feature matrix R of formula (1) is minimized to
[0021] R = argmin_{r∈R^d} Σ_i ||b_i − R·s_i||_2^2 (3),
[0022] wherein the range of the feature matrix R is selected to correspond to the zero positions of column k of the matrix s. [0023] Preferably, step S4 includes: using the image features R_train = {r_i | i = 1, 2, …, n} obtained in step S1 as an input of the deep learning network, letting d_i be the correspondingly obtained reconstruction features, and calculating a reconstruction error and an average error of the multi-source target; [0024] the features obtained after feature reconstruction are expressed as:
[0025] T = {v_i | e_i ≤ η, v_i ∈ V} (5).
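The optimization in formula (1) has the form of a standard dictionary-learning objective. As a minimal numpy sketch (an illustration, not the patented method itself): assuming the weight vectors s_i are held fixed, the feature matrix R minimizing Σ_i ||b_i − R·s_i||² has a closed-form least-squares solution. The function name and data shapes are illustrative assumptions.

```python
import numpy as np

def fit_feature_matrix(B, S):
    """Least-squares fit of the feature matrix R in formula (1):
    R = argmin_R sum_i ||b_i - R s_i||^2,
    where column i of B is a tracked target b_i and column i of S is
    its weight vector s_i. Closed form: R = B S^T (S S^T)^+."""
    return B @ S.T @ np.linalg.pinv(S @ S.T)

# toy check: if B = R_true S exactly, the fit recovers R_true
rng = np.random.default_rng(0)
R_true = rng.standard_normal((6, 4))
S = rng.standard_normal((4, 20))
B = R_true @ S
R = fit_feature_matrix(B, S)
```

In an alternating scheme this update would be interleaved with a sparse-coding step that re-estimates S while R is held fixed.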
[0026] Preferably, step S5 includes: calculating a reconstruction error by formula (7):
[0027] e = ||x_i − x'_i||^2 (7).
[0028] Preferably, the feature data of the multi-source target is one or more of image data, radar data, communication data, and position data. [0029] Preferably, in step S5, the generative model of feature matrix learning is a top-down generative model. [0030] Preferably, in step S5, the reconstruction model of feature matrix learning is a bottom-up reconstruction model. [0031] Compared with the prior art, the present invention has the following advantageous effects: [0032] (1) In the multi-source target tracking method for complex scenes based on deep learning, first, a multi-source complete feature matrix library of video image features of massive different tracked targets is established through feature matrix learning. Specifically, in compression learning of the feature matrix, non-zero columns of original videos are selected to correspond to sparsified feature matrix atoms, thereby forming the complex feature matrix library based on deep learning, wherein the feature matrix library is suitable for a layered deep learning architecture. A top-down generative model and a bottom-up reconstruction model are built according to the MMSE criterion, and feature reconstruction is performed by using a multilayer feedforward network discriminatively trained based on the criterion. [0033] (2) The method of the present invention is applicable to various feature models of target features in different complex environments, such as a wilderness scene, an urban scene, and an open space scene; sparse representations of feature matrix learning are established, and at the same time a complete feature matrix is formed. Furthermore, a constraint weight non-zero coefficient is established in the present invention to obtain sparse representations closer to real different scenes.
[0034] (3) The basic structure of the typical multi-layer deep learning network in the present invention consists of memory, storage, and knowledge networks, wherein multiple networks are fully linked to serve as feature transfer channels between layers, and meanwhile each layer is also used for training the next feature structure. The deep learning network of the present invention has the capability of complex data modeling, including a top-down generative model and a bottom-up discriminative model. The deep learning network also has data training performance with weakly supervised learning. Therefore, the multi-source target feature matrix library established through the deep learning network in the present invention has a good modeling characteristic and data statistical structure, and a multi-source target tracker model may be generated for unlabeled training data. Brief Description of the Drawings [0035] To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below only illustrate some embodiments of the present invention, and those of ordinary skill in the art may obtain other drawings based on these drawings without creative work. [0036] Fig. 1 is a flow diagram of a multi-source target tracking method for complex scenes based on deep learning provided by an embodiment of the present invention. Detailed Description of the Embodiments [0037] To make the technical problems to be solved, technical solutions, and beneficial effects of the present invention clearer, the present invention is further described in detail below in conjunction with the drawings and embodiments.
It should be understood that the specific embodiments described herein are only used for explaining the present invention, rather than limiting the present invention. [0038] An embodiment of the present invention provides a multi-source target tracking method for complex scenes based on deep learning, which may be used in technical fields such as video recognition and tracking. [0039] The method of the present invention first establishes a complete feature matrix library of multi-source video images of massive different tracking targets through feature matrix learning. The feature matrix learning of the present invention, as a signal transformation method, may achieve approximate representation of high-dimensional space features by low-dimensional space vectors according to the matrix basis vectors of an over-complete feature matrix. The embodiment of the present invention is directed to various feature models of target features in different complex environments, such as a wilderness scene, an urban scene, and an open space scene; the sparse representations of feature matrix learning are established, and at the same time a complete feature matrix is formed. Meanwhile, a constraint weight non-zero coefficient is established by the method of the present invention, so as to obtain sparse representations closer to real different scenes. Then optimization approximation of a multi-source target feature matrix in different feature scenes is accomplished. [0040] Referring to Fig. 1, which shows a flow diagram of a multi-source target tracking method for complex scenes based on deep learning provided by an embodiment of the present invention.
The tracking method includes the following steps: [0041] S1: providing multi-source target data, such as video data, forming an initial deep learning network, setting weights of the initial deep learning network, performing steepest gradient dimensionality reduction on the weights of the initial deep learning network, and forming an initial feature matrix of a multi-source target; [0042] S2: performing simplified sparse representation on features of the multi-source target, and performing classification on the multi-source target through the deep learning network to obtain a sparsified feature matrix; [0043] S3: through multi-source target classification of the sparsified feature matrix, establishing feature matrix representations of different feature targets, wherein a target system may include: target image features, target radar features, target communication features, and target location features; [0044] S4: minimizing the sparsified feature matrix, and updating the initial feature matrix to obtain a minimized sparsified feature matrix, i.e. an output of the memory system knowledge network in the figure; [0045] S5: building a top-down generative model and a bottom-up reconstruction model through the deep learning network, wherein the generative model is a generative template model of feature matrix learning, and the reconstruction model is a reconstruction template model of feature matrix learning; and [0046] S6: performing effective multi-source target tracking of the targets through the generative template model and the reconstruction template model. [0047] Specifically, the above-mentioned multi-source target feature data may be one or more of image data, radar data, communication data, and position data.
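Steps S1 through S6 can be pictured as an alternating loop over coding and feature-matrix updates. The toy sketch below is purely illustrative (the patent does not give this algorithm explicitly): it alternates a hard-thresholded coding step (S2) with a least-squares update of the feature matrix (S4), stopping when the average reconstruction error stabilizes; the function name, threshold, and tolerance are assumptions.

```python
import numpy as np

def track_features(B, n_atoms=4, tau=0.1, iters=20, tol=1e-6):
    """Toy end-to-end sketch of steps S1-S6 (illustrative, not the
    patented algorithm). B holds one target per column."""
    rng = np.random.default_rng(0)
    R = rng.standard_normal((B.shape[0], n_atoms))   # S1: initial feature matrix
    prev = np.inf
    for _ in range(iters):
        S = np.linalg.pinv(R) @ B                    # coding weights
        S[np.abs(S) < tau] = 0.0                     # S2: hard-threshold sparsify
        R = B @ S.T @ np.linalg.pinv(S @ S.T)        # S4: least-squares update of R
        err = np.mean(np.sum((B - R @ S) ** 2, axis=0))
        if abs(prev - err) < tol:                    # S5: stop when error stabilizes
            break
        prev = err
    return R, S, err

# toy run on synthetic data
rng = np.random.default_rng(42)
B = rng.standard_normal((6, 30))
R, S, err = track_features(B)
```

The learned pair (R, S) would then feed the generative and reconstruction template models of steps S5 and S6.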
[0048] Using ground-air multi-source target tracking in complex environments as an example, a ground-air communication scene is established, and a ground station performs target tracking of 10 unmanned aerial vehicles (UAVs) through different data link features. The radar reflection cross-sections of different UAVs are different. The different UAV radar reflection cross-sections and acquired image features are used in combination with multi-source features of a communication carrier frequency to form a feature matrix network. [0049] In this specific embodiment, step S2 includes: obtaining the UAV radar reflection cross-sections of the multi-source target, performing classification through the deep learning network by using the UAV radar reflection cross-sections and the acquired image features in combination with the features of the communication carrier frequency to obtain a sparsified target feature matrix, and performing target mapping through the feature matrix. [0050] Specifically, step S5 includes: building a top-down generative model through the deep learning network, wherein the generative model is a joint feature model including the UAV radar reflection cross-sections and the acquired image features in combination with the features of the communication carrier frequency, and the generative model is a generative template model of feature matrix learning; and building a bottom-up reconstruction model through the deep learning network, wherein the model is a reconstruction template model of feature matrix learning.
[0051] Further, step S1 specifically includes: building a deep learning network, and forming the initial feature matrix by using features of the multi-source target through the deep learning network, wherein the initial feature matrix may implement adaptive changes and improvements according to the feature requirements of the multi-source target, that is, the UAV radar reflection cross-sections, the acquired image features, and the features of the communication carrier frequency, and according to subsequent iterations of the matrix. An expression of the initial feature matrix is:
[0052] R = argmin_R Σ_{i=1}^{k} ||b_i − R·s_i||_2^2 (1),
[0053] where k represents the number of targets of the multi-source target, i indexes each target feature, s_i represents a weight matrix of the deep learning network, and b_i represents the tracked multi-source target. [0054] In other embodiments, the initial feature matrix may implement adaptive changes and improvements according to feature requirements of the multi-source target, such as texture and contour features, and subsequently according to iterative initialization of the feature matrix. [0055] Further, step S2 specifically includes: performing feature decomposition of the current multi-source target, and substituting the decomposed features into formula (1), that is, implementing two stages of iteration and update by using the initial feature matrix. In this step, the amount of calculation is increased by the decomposition of each node, and in the decomposition process, problem solving is transformed into finding the sparse coding that minimizes ||b_i − R·s_i|| (iterative update of the feature matrix). For this problem, and to reduce the risk of selecting a redundant feature matrix, the present invention simplifies the sparse representation step, which is determined by updating the columns of matrix R: if a column of R is less than a determined threshold, then row k of the feature matrix D may be processed as a zero vector.
A target x would then be updated with the support of the feature matrix D and a matrix coefficient y. [0056] A core idea of step S2 is performing effective feature matrix update iteration by feature decomposition, and implementing generation of sparsified features through preprocessing of the feature matrix. [0057] One of the improvements of feature matrix learning in the present invention is sparse representation. To improve the sparse representation of feature matrix learning, a fixed feature matrix basis vector usually needs to be updated in the design of the feature matrix. Therefore, whether a complete feature matrix is well designed determines how closely a real signal can be represented. The basic structure of the multi-layer deep learning network used in the embodiment of the present invention consists of memory, storage, and knowledge networks. Specifically, multiple networks are fully linked to serve as feature transfer channels between layers, and meanwhile each layer is also used for training the next feature structure. Deep learning networks have been applied in different fields and may include many different forms of data generation models at the same time. The deep learning network of the present invention can implement complex data modeling, including a top-down generative model and a bottom-up discriminative model. This indicates that the deep learning network of the present invention has data training performance with weak supervision by establishing a multi-source target matrix library. The multi-source target feature matrix library established through the deep learning network in the present invention has a good modeling characteristic and data statistical structure, and a multi-source target tracker model may be generated for unlabeled training data.
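The simplified sparse-representation step described in [0055] — zeroing out components whose magnitude falls below a determined hard threshold — can be sketched as follows. The function name and the threshold value are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def hard_threshold_columns(S, tau):
    """Zero out columns of the weight matrix S whose L2 norm falls
    below the hard decision threshold tau (threshold is illustrative)."""
    S = S.copy()
    norms = np.linalg.norm(S, axis=0)
    S[:, norms < tau] = 0.0
    return S

S = np.array([[0.9, 0.01, 0.5],
              [0.4, 0.02, 0.0]])
S_sparse = hard_threshold_columns(S, tau=0.1)
# the near-zero middle column is suppressed; the other columns are kept
```

A hard threshold keeps surviving coefficients unchanged, unlike the soft threshold discussed below formula (2), which would also shrink them.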
[0058] Further, step S2 further includes: under the joint decision of the initial feature matrix R of the multi-source target and the weight matrix s of the deep learning network, the optimization problem of formula (1) may be equivalent to the optimal value problem of formula (2), as indicated by the following formula:
[0059] R = argmin_{r∈R^d} Σ_{i=1}^{k} ||b_i − R·s_i||_2^2 (2),
[0060] and further includes a feature matrix of closest joint sparse representation. The core of solving this problem is determining hard threshold discrimination of the columns of s, so as to retain an amplitude discrimination threshold in each column. For example, in similarity analysis, a simple soft threshold function is used to solve formula (1). If sparse constraint convex relaxation occurs, it will be difficult to solve by calculating a simpler soft threshold. Therefore, the method of the present invention is implemented by using a determined hard decision threshold. [0061] Further, step S4 specifically includes a feature matrix update step, which minimizes the feature matrix R of formula (1) to
[0062] R = argmin_{r∈R^d} Σ_i ||b_i − R·s_i||_2^2 (3).
[0063] Here, the update range of the feature matrix R is determined by column selection of the weight matrix s. The range of the matrix R is selected to correspond to the zero positions of column k of the weight matrix s. This step only uses limited prior information of s instead of the complete matrix, thereby reducing the calculation amount of the feature matrix update, and effectively supporting the learning of the feature matrix update step with a limited calculation amount.
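The restricted update of formula (3), which touches only the part of R selected from the weight matrix s, resembles a K-SVD style atom update that uses only the signals whose k-th weight is non-zero. The sketch below is written under that assumption — the patent does not spell out the exact update rule, and the function name is illustrative:

```python
import numpy as np

def update_atom(R, S, B, k):
    """Update column k of the feature matrix R using only the signals
    whose k-th weight is non-zero (the "limited prior information of s"),
    in the spirit of a K-SVD style rank-1 refit. Illustrative sketch."""
    support = np.nonzero(S[k, :])[0]
    if support.size == 0:
        return R
    # residual of the selected signals with atom k's contribution removed
    E = B[:, support] - R @ S[:, support] + np.outer(R[:, k], S[k, support])
    new_atom = E @ S[k, support]                    # rank-1 refit of atom k
    R = R.copy()
    R[:, k] = new_atom / (np.linalg.norm(new_atom) + 1e-12)
    return R

# toy check: with B = R0 S exactly, the refit atom realigns with the original
rng = np.random.default_rng(1)
R0 = rng.standard_normal((5, 3))
S = rng.standard_normal((3, 8))
S[0, :4] = 0.0          # atom 0 is unused by the first four signals
B = R0 @ S
R1 = update_atom(R0, S, B, 0)
```

Because only the columns in `support` enter the computation, the cost of the update scales with the number of non-zero weights rather than with the full matrix, matching the stated goal of a limited calculation amount.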
[0064] Further, step S4 specifically includes: using the image features R_train = {r_i | i = 1, 2, …, n} obtained in step S1 as an input of the deep learning network, letting d_i be the correspondingly obtained reconstructed distance features, calculating a reconstruction error of the multi-source target according to the MMSE (Minimum Mean Squared Error) criterion, and calculating an average error by the following formula:
[0065] ē = (1/n) Σ_{i=1}^{n} e_i (4),
[0066] where e_i represents an error value of each target. [0067] In the feature learning iteration, the features obtained after feature reconstruction are expressed as:
[0068] T = {v_i | e_i ≤ η, v_i ∈ V} (5),
[0069] where v_i represents the reconstruction features of the multi-source target, V represents the acquired target features, and η represents a set feature threshold. Further, the step of "building a top-down generative model" in step S5 includes a feature learning process with a termination criterion: the iteration stops when the difference between the average reconstruction error of the current iteration and the average value of the previous iteration falls below a set value. In the iterative feature learning process, as a reconstruction weight matrix of eigenvalues is more reliable, suppose M is the reconstruction weight matrix, and let X_test = {x_i | i = 1, 2, …, l} be multi-source features on a test data set, where X is the extracted multiple features; then the reconstruction features X'_test may be expressed as:
[0070] X'_test = Reconstruction(X_test, M) (6).
[0071] From the feature matrix, a reconstruction error is calculated:
[0072] e = ||x_i − x'_i||^2 (7).
[0073] In the method proposed in the present invention, the UAV radar reflection cross-sections, the acquired image features, and the features of the communication carrier frequency are learned by compressing the feature matrix, to obtain weakly supervised learning initialization weights of the deep learning model.
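Formulas (4) and (7), together with the termination criterion described above, can be sketched as follows, assuming e_i is the squared L2 reconstruction error of target i; the function names and the tolerance value are illustrative:

```python
import numpy as np

def reconstruction_errors(X, X_rec):
    """Per-target squared reconstruction errors e_i = ||x_i - x'_i||^2
    (formula (7)) and their average (formula (4)); one target per column."""
    e = np.sum((X - X_rec) ** 2, axis=0)
    return e, e.mean()

def should_stop(prev_avg, avg, tol=1e-4):
    """Stop the feature-learning iteration when the average error no
    longer changes appreciably (tolerance value is illustrative)."""
    return abs(prev_avg - avg) < tol

X = np.array([[1.0, 2.0], [3.0, 4.0]])
X_rec = np.array([[1.0, 2.0], [3.0, 3.0]])
e, avg = reconstruction_errors(X, X_rec)
# e = [0.0, 1.0], average = 0.5
```

The retained set T of formula (5) would then be the targets whose e_i falls below the threshold η.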
Through reconstruction, decision, and feature selection between layers, the weights of the network are adjusted, and simplified weights of the system are finally obtained to achieve tracking of the multi-source feature target. [0074] The deep learning network proposed in the present invention is a layered architecture. First, non-zero columns are selected to implement compression learning of the feature matrix, thereby forming a preliminary feature matrix. By establishing the layered deep learning architecture, the multi-source target in complex scenes may be tracked effectively. Scene data verification shows that the target tracking efficiency of the multi-source target tracking method of the present invention is higher than that of single-algorithm tracking in a single dimension. In a specific operation, in a scene of tracking multiple UAVs at low altitude, with high-order QAM modulation and the radar reflection cross-section set to 1, the tracking efficiency of the multi-source target is 99%. [0075] Described above are only preferred embodiments of the present invention, and all modifications, equivalent substitutions, and improvements made within the spirit and principle of the present invention should be encompassed within the protection scope of the present invention.
Claims:
1. Multi-source target tracking method for complex scenes, comprising the following steps: S1: for multi-source target feature data, creating an initial deep learning network, setting weights of the initial deep learning network, performing steepest gradient dimensionality reduction on the weights of the initial deep learning network, and forming an initial feature matrix of a multi-source target; S2: performing simplified sparse representation and sparsification processing on the initial feature matrix of the multi-source target to obtain a sparsified target feature matrix; S3: classifying the sparsified target feature matrix, and establishing feature matrix representations of different feature targets; S4: minimizing the sparsified target feature matrix; S5: building a generative model and a reconstruction model for feature matrix learning; and S6: performing effective multi-source target tracking of the targets by means of the generative model and the reconstruction model.
2. The multi-source target tracking method according to claim 1, wherein step S1 comprises: building a deep learning network, and forming the initial feature matrix using features of the multi-source target by means of the deep learning network, wherein the initial feature matrix has adaptive change and improvement functions, and an expression of the initial feature matrix is: R = argmin_R Σ_{i=1}^{k} ||b_i − R·s_i||_2^2 (1).
3. The multi-source target tracking method according to claim 1, wherein step S2 comprises: performing effective feature matrix update iteration by feature decomposition, and implementing generation of sparsified features by preprocessing of the feature matrix.
4. The multi-source target tracking method according to claim 2, wherein step S2 comprises: transforming the optimization problem of formula (1) into the optimal value problem of formula (2), as indicated by the following formula: R = argmin_{r∈R^d} Σ_{i=1}^{k} ||b_i − R·s_i||_2^2 (2), where R represents the initial feature matrix of the multi-source target, and s represents a weight matrix of the deep learning network.
5. The multi-source target tracking method according to claim 2, wherein step S3 further comprises a feature matrix update step, wherein the feature matrix R of formula (1) is minimized to R = argmin_{r∈R^d} Σ_i ||b_i − R·s_i||_2^2 (3), wherein the range of the feature matrix R is selected to correspond to the zero positions of column k of the matrix s.
6. The multi-source target tracking method according to claim 1, wherein step S4 comprises: using the image features R_train = {r_i | i = 1, 2, …, n} obtained in step S1 as an input of the deep learning network, obtaining d_i as the corresponding reconstruction features, and calculating a reconstruction error and an average error of the multi-source target, wherein the features obtained after the feature reconstruction are expressed as: T = {v_i | e_i ≤ η, v_i ∈ V} (5), where v_i represents reconstruction features of the multi-source target, V represents acquired target features, and η represents a set feature threshold.
7. The multi-source target tracking method according to claim 1, wherein step S5 comprises: calculating a reconstruction error by formula (7): e = ||x_i − x'_i||^2 (7).
8. The multi-source target tracking method according to claim 1, wherein the multi-source target feature data is one or more of image data, radar data, communication data, and position data.
9. The multi-source target tracking method according to claim 1, wherein, in step S5, the generative model is a top-down generative model.
10. The multi-source target tracking method according to claim 1, wherein, in step S5, the reconstruction model is a bottom-up reconstruction model.
Family patents:
Publication number | Publication date: CN110569807A | 2019-12-13; NL2026432B1 | 2022-02-22
Priority:
Application number | Filing date | Patent title: CN201910857949.6A | 2019-09-09 | Multi-source target tracking method for complex scene